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Abstract 

A software library is presented for the polynomial expansion method (PEM) of the 
density of states (DOS) introduced in Ref [1, 2]. The library provides all necessary 
functions for the use of the PEM and its truncated version (TPEM) in a model 
independent way. The PEM/TPEM replaces the exact diagonalization of the one 
electron sector in models for fermions coupled to classical fields. The computational 
cost of the algorithm is 0(N) - with N the number of lattice sites - for the TPEM [3] 
which should be contrasted with the computational cost of the diagonalization tech- 
nique that scales as 0(N 4 ). The method is applied for the first time to a double 
exchange model with finite Hund coupling and also to diluted spin-fermion models. 
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PROGRAM SUMMARY 



(Computer program available on request) 
Title of library: TPEM 
Distribution Format: tar gzip file 
Operating System: Linux, UNIX 
Number of Files: 4 plus 1 test program 

Keywords: Moment expansion, Monte Carlo method, Correlated 
electrons 

Programming language used: C 
Computer: PC 

Typical running time: The test program takes a few seconds to run. 
External routines: The LaPack library can be used by the test program 
and the GSL library can be used by TPEM though they are not essential. 
Nature of the physical problem: The study of correlated electrons coupled 
to classical fields appears in the treatment of many materials of much current 
interest in condensed matter theory, e. g. manganites, diluted magnetic semi- 
conductors and high temperature superconductors among others. 
Method of solution: Typically an exact diagonalization of the electronic sec- 
tor is performed in this type of models for each configuration of classical fields, 
which are integrated using a classical Monte Carlo algorithm. A polynomial 
expansion of the density of states is able to replace the exact diagonalization, 
decreasing the computational complexity of the problem from 0{N i ) to O(N) 
and allowing for the study of larger lattices and more complex and realistic 
systems. 

Key References: 0, 0, B, . 



1 Introduction 

The problem of fermions coupled to classical fields appears in many contexts 
in condensed matter physics. In this kind of problems the fermionic opera- 
tors appear in the Hamiltonian involving only quadratic terms. They can be 
solved |4J by diagonalizing the fermions exactly in the one-electron sector at 
finite temperature for a given configuration of classical fields. The classical 
fields are integrated by means of a classical Monte Carlo algorithm. The pro- 
cedure, apart from being exact within the error bars, preserves the lattice 
throughout the calculation making it possible to study the spatial dependence 
of the observables. This is a crucial issue to understand inhomogeneities and 
has been successfully applied to the study of many materials p. In the case 
of manganites such models have been used to understand the phase diagram 
of these materials as well as the colossal magnetoresistance effect i. e. 
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the colossal response of the system to magnetic fields, that could have many 
important applications. In this case, the classical field is the local t^g spin. 
Diluted magnetic semiconductors have also been studied in a similar way 0. 
The inhomogeneities that appear there in the form of ferromagnetic clusters, 
could only be accessed by the use of this method 0| ■ In addition, a model for 
high temperature superconductors has been pressed g to study the compe- 
tition between d-wave superconductivity and antiferromagnetism that seems 
to explain interesting properties of these complex materials. 
Despite all these many advantages of the method, it is still very costly in 
terms of computational effort. Indeed, the method scales as order 0(N 4: ) and 
the largest lattices that can be accessed in a practical way contain no more 
than 6 3 sites or its equivalent in lower dimensions. This imposes limitations 
on the kind of physical systems that can be studied, for example, the Mn spin 
concentration in diluted semiconductors has to be high enough, the study of 
many band systems becomes difficult, etc. 

Trying to solve some of these problems, two of the authors (N.F. and Y.M.) 
proposed 0,0] in 2001 a procedure that replaces the exact diagonalization of 
the one-electron sector by a series expansion of the density of states in terms 
of Chebyshev polynomials. The method takes advantage of the sparseness of 
the Hamiltonian matrix (which is the case in virtually all systems of physical 
interest) to perform the matrix- vector products that appear in the calculation 
of the terms or moments of the expansion. In what follows, this method will be 
referred to as the polynomial expansion method or PEM. In 2003 an improve- 
ment of the PEM was proposed, [^] based on two controllable approximations 
that, as will be seen, do not diminish in any way the quality of the results. The 
first of these approximations is the truncation of each matrix-vector multipli- 
cation, including only products that are larger than a certain threshold. The 
second one is the truncation of the difference in Boltzmann probability weight 
or action between two very similar configurations of classical fields. This dif- 
ference appears in the Monte Carlo procedure with enormous frequency and 
so its truncation turns out to be very effective. This new truncated PEM will 
be referred to as TPEM. 

In this paper we present a C library that implements both the PEM and 
TPEM. The library is model independent and basically takes as input the 
Hamiltonian matrix of practically any model of fermions coupled to classical 
fields. To our knowledge no such library is presently available but its useful- 
ness is evident: the TPEM can be easily separated from other details of the 
main program(s) and users do not have to be concerned with the technicalities 
of the method. In this sense the library presented here places the TPEM at 
the same level of the exact diagonalization. 

Another algorithm for the study of spin-fermion models on large lattices is 
the "Hybrid Monte Carlo Algorithm", that has been applied to the double- 
exchange model with infinite coupling 0. In this method the model is for- 
mulated in the path integral representation, introducing imaginary time, and 
a Hybrid Monte Carlo (HMC) procedure is used to evolve the system. The 
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TPEM seems to work best than the HMC at low temperature where the HMC 
presents increasing computational cost due to the time discretization. Further- 
more, the HMC is applicable when the bands are connected and do not extend 
over a wide range of energies, as is the case of finite coupling systems. The 
TPEM also allows for easy parallelization, improving the performance even 
more. 

The paper is divided as follows. Section 2 explains the theory underlying the 
TPEM. In Section 3 the implementation details and the functions provided 
by the library are described. Section 4 shows some simple examples on how 
to use the library. Finally, in Section 5, the TPEM is applied to a model for 
manganites with finite coupling and also to diluted spin-fermion systems. 
It is important to emphasize that the application of the TPEM to finite Hund 
coupling and diluted systems is novel and shows that the method is suitable 
to study both systems with disconnected bands and systems with impurity 
bands. 



2 Theoretical Overview 

The analysis starts with a model defined by a certain Hamiltonian, H = 
J2ijap c l a Hiaj/3(4>)cjf3, where the indices i and j denote a spatial index and a 
and (3 some internal degree(s) of freedom, e. g. spin or orbital. The Hamil- 
tonian matrix depends on the configuration of one or more classical fields, 
represented by <fi. Although no explicit indices will be used, the field(s) <fi 
are supposed to have a spatial dependence. The partition function for this 
Hamiltonian is given by: 

Z = f ' d<f> X>|exp(-/3iy ((/))+ PnN)\n) (1) 

where \n) are the eigenvectors of the one-electron sector. To calculate the 
observables, an arbitrary configuration of classical fields is selected as a 
starting point. The Boltzmann weight or action of that configuration, S ((/>), 
is calculated by diagonalizing the one-electron sector at finite temperature. 
Then a small local change of the field configuration is proposed, so that the 
new configuration is denoted by <fi' and its action is recalculated to obtain the 
difference in action AS = S(4>') — S(<f)) . This new configuration is accepted or 
rejected based on a Monte Carlo algorithm like Metropolis or heat bath and the 
cycle starts again. In short, the observables are traditionally calculated using 
exact diagonalization of the one-electron sector at every spin "flip" and Monte 
Carlo integration for the classical fields 0]. The PEM/TPEM substitutes the 
diagonalization for a polynomial expansion and the details are presented in 
Ref. 

It will be assumed that the Hamiltonian H((p) is normalized, which simply 
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implies a re-scaling: 



H^(H-b)/a 

b = (E max + E min )/2, (2) 

where i?^ and E min are the maximum and minimum eigenvalues of the 
original Hamiltonian respectively. This in turn implies that the normalized 
Hamiltonian has eigenvalues e v G [—1,1]. The values of E max and E min depend 
on the particular Hamiltonian under consideration and should be calculated 
in advance. 

The observables that can be calculated directly 2 with the TPEM fall into 
two categories: (i) those that do not depend directly on fermionic operators, 
e. g. the magnetization, susceptibility and classical spin-spin correlations in 
the double exchange model and (ii) those for which a function F{x) exists 
such that they can be written as: 

A(<j>) = j F{x)D{<j>,x)dx, (3) 

where D(<f),e) = ^2 l/ S(e((f)) — e„), and e u are the eigenvalues of H(<f>), i. e. 
D(<p,x) is the density of states of the system. For the former, the calculation 
is straightforward and simply involves the average over Monte Carlo configu- 
rations. For the latter, a function F(x) can be expanded in terms of Chebyshev 
polynomials in the following way: 



F{*) = E fmT m (x), (4) 

m=0 

where T m (x) is the m— th Chebyshev polynomial evaluated at x. Let a m = 
2 — 6 m> o. The coefficients f m are calculated using the formula: 

f m = J a m F(x)T m (x)/ (7iVl - x 2 ). (5) 
The moments of the density of states are defined by: 

N dirn 

Vm{<P)= J2(v\T m (H(<P))\"), (6) 



2 In principle, it would be possible to calculate more complicated observables 
by expanding not only the density of states but also off-diagonal elements, e. g. 



5 



where Ndim is the dimension of the one-electron sector. Then, the observable 
corresponding to the function F(x), can be calculated using: 



^)=E/m^W ( 7 ) 



In practice, the sum in Eq. (7) is performed up to a certain cutoff value M, 
without appreciable loss in accuracy as described in Ref. 0,0]. The calculation 
of n m is carried out recursively. \v\ m) = T m (H)\u) = 2H\v; m — 1) — \v; m — 2) 
and hence: 



A*2m = J2((™>; v\v\ m) - 1) 

v 

^2m+i = J2((m; v\v\ m) + 1 - (u; 0\u; 1)), (8) 



are used to calculate the moments. The process involves a sparse matrix- vector 
product, e. g. in T m (H)\v), yielding a cost of 0(N 2 ) for each configuration, 
i. e. 0(N 3 ) for each Monte Carlo step. In addition, an improvement of the 
present method has been proposed [3( based on a truncation of the matrix- 
vector product mentioned before and it turns out that the resulting algorithm 
has a complexity linear in N. This approximation is controlled by the small 
parameter e pr . 

Moreover, for the Monte Carlo integration procedure, the difference in action, 
AS = S((f>') — S(4>) has to be computed at every step. Since this operation re- 
quires calculating two set of moments, for and 0', the authors of Ref. j^] have 
also developed a truncation procedure for this trace operation controlled by a 
small parameter, e tT . This truncation is based on the observation that if <fi and 
(f)' differ only in very few sites then the corresponding moments will differ only 
for certain indices allowing for a significant reduction of the computational 
complexity. The TPEM library presented here implements this truncation as 
well. 

In what follows, the size of the Hilbert space will be denoted by Nfo m and it 
will depend on the size of the lattice as well as on the particular model to be 
studied. For a one-band double exchange model on a lattice of N sites and 
finite coupling, Ndim = 2iV; the factor of 2 accounts for the spin degree of 
freedom. 
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3 The Library 



3. 1 Implementation 



The code is written in C and can be called from a C or C++ program. If 
the library is compiled statically the file libtpem.a is produced. To use the 
functions provided by the library the header file "tpem.h" has to be included. 
In the code, complex numbers are simply represented by the structure: 
typedef struct { double real, imag; } tpem_t; 

As mentioned before, matrix-vector multiplications must be performed in a 
sparse way, i. e., multiplications that yield a null result must be avoided for 
efficiency The structure tpem_sparse, defined in tpem_sparse.c, implements a 
sparse matrix in compressed row storage (CRS) format. The CRS format puts 
the subsequent nonzero elements of the matrix rows in contiguous memory 
locations. We create 3 vectors: one for complex numbers containing the values 
of the matrix entries and the other two for integers (colind and rowptr). The 
vector values stores the values of the non-zero elements of the matrix, as 
they are traversed in a row-wise fashion. The colind vector stores the column 
indices of the elements of the values vector. That is, if values[k] = a[i][j] then 
colind[k] = j. The rowptr vector stores the locations in the values vector 
that start a row, that is values[k] = a[i][j] if rowptr[i] < i < rowptr[i + 1]. 
By convention, we define rowptr[Ndi m ] to be equal to the number of non-zero 
elements, n z , in the matrix. The storage savings of this approach are significant 
since instead of storing N% im elements, we need only 2n z + Ndi m + 1 storage 
locations. 

To illustrate how the CRS format works, consider the non-symmetric matrix 
defined by 

10 -2 
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2 



3 



13 
-1 



(9) 



The CRS format for this matrix is then specified by the arrays: 
values = [10 -2 3 9 3 7 8 7 3 ... 9 13 4 2 -1 ] 
colind = [040151230 ... 45145] 
rowptr = [ 2 5 8 12 16 19 ] 

Besides the obvious saving in storage, CRS format allows for a model in- 
dependent library implementation and an easy algorithm for matrix-vector 
multiplication as shown in Fig. 1. 
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void tpem_sparse_mult (sparse *matrix, tpem_t *dest, 

tpem_t *src) 

{ 

size_t row, col, k; 
tpem_t tmp; 

for (row = 0; row < matrix->rank; row++) { 
sum = 0.0; 

for (k=matrix->rowptr [row] ;k<matrix->rowptr [row+1] ;k++) { 
col = matrix->colind [k] ; 
sum += matrix->values [k] * src[col]; 

} 

dest [row] = sum; 

} 

} 



Fig. 1. Matrix- vector multiplication function using the CRS format, src contains 
the vector to be multiplied and the results are stored in dest. 

The truncations in the matrix-vector product and in the action difference 
are calculated with the aid of tpemjsubspace.c which implements a simple 
stack. The stack is used to hold a "subspace" of kets of the one-electron 
Hilbert space that grows dynamically. It is in this subspace that matrix- vector 
multiplications are performed instead of using the complete Hilbert space. 

3.2 Functions provided by the Library 

1. void tpem_init(); 

Description: It must be called before using the library. 

2. void tpem_calculate_coef f s (size_t M, double *coeffs, 
double (*G) (int, double) ) ; 

Description: It calculates f m using Eq. (5). 

Arguments: 

• M: cutoff (input) 

• coeff s: array of doubles where the coefficients f m , Eq. (4) will be stored, 
(output) 

• double (*F) (double): The function corresponding to the observable that 
we want to expand as given by Eq. (3). The function takes a double and 



8 



returns a double (input). 



3. void tpem_calculate_moment_tpem (tpem_sparse *matrix, size_t M, 
double *moment, double epsProd) ; 

Description: It calculates the moments of the density of states, /i m , as defined 
by Eq. (6). The method used is TPEM as described in 0|. 

Arguments: 

• matrix: the matrix in compressed row storage (input). 

• M: the cutoff (input). 

• moment: array of doubles to store the moments, Eq. (6) (output). 

• epsProd: the tolerance for the matrix- vector product truncation (input). 

4. void tpem_calculate_moment_pem (tpem_sparse *matrix, size_t 
M, double *moment); 

Description: Same as previous but it uses PEM algorithm. 

void tpem_calculate_moment_dif f _tpem (tpem_sparse *matrixO, tpem_sparse 
*matrixl, size_t M, double *moments, size_t n_support, size_t *support, 
double epsTrace, double epsProd); 

Description: It calculates the difference in moments for two matrices, using 
the trace truncation algorithm. 

Arguments: 

• matrixO: The first matrix (input). 

• matrixl: The second matrix (input). 

• M: the cutoff (input). 

• moments: array of doubles to store the difference in moments (output). 

• n_support: Number of entries where the two matrices differ (input). 

• support: Vector containing the column index of the entries where the two 
matrices differ. For example, in the double exchange model with finite cou- 
pling if site % is being updated then support= [i , i+N] where N is the number 
of sites (input). 

• epsTrace: The tolerance for the trace truncation algorithm (input). 

• epsProd: The tolerance for the matrix-vector multiplication truncation (in- 
put). 

Note that this function does not calculate the moment difference for any two 
matrices, only for matrices that differ in indices specified by n_support and 
support as explained above. 

5. void tpem_calculate_moment_diff_pem (tpem_sparse *matrixO, tpem_sparse 
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*matrixl, size_t M, double *moments); 

Description: It calculates the difference in moments for two matrices without 
approximations. This function is provided for easy integration of PEM and 
TPEM algorithms. Since there is no truncation, the support array, its size 
and the tolerances are not needed. 

6. double tpem_expansion (size_t M, double *moments, double *co- 
effs) 

Description: Given f m and /i m calculates A((f>) as given by Eq. (7). 

7. void tpem_f inalizeO ; 

Description: It can be called to free the resources used by the library and reset 
all input. 



4 Simple Examples 

4-1 Calculating an Integral 

To illustrate the use of the library several integrals will be calculated based 
on a density of states, D(x), given by the one-dimensional spinless double 
exchange model with random potentials, Vi, whose Hamiltonian is: 

H = -t J2 i^W + E^W. (io) 

<ij>,a i 

The complete code discussed in this section is provided in the file tpem_test.c. 
The most important steps will be explained here. The matrix used is 400 x 400, 
is calculated in the CRS format and normalized so that its eigenvalues are 
in the interval [—1, 1] as explained in Section 2 and in Eq. (2). The library 
must be initialized by calling the function tpem_init. Let us define E(x) = 
5.0x(1.0 — tanh(lO.Ox)) 3 and calculate j\ D(x)E(x)dx applying both exact 
diagonalization and the TPEM. In tpem_test.c the diagonalization is done 
by calling the function diag_apply. Next the integral is performed using the 
TPEM for different values of the cutoff, M, and fixed e pr = 10~ 5 and e tr = 10~ 3 
in the function tpem_apply. The code is self explanatory and shows the ease 
of use of the library: First, the coefficients need to be calculated using: 
tpem_calculate_coef f s (cutoff, coeffs, f uncptr) ; 
Next, the moments are obtained by calling: 

tpem_calculate_moment_tpem (matrix, cutoff, moment, eps); 

3 This function emulates an energy function for a fermionic system. 
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<E(x)> E(x)=5x[1-tanh(10x)] 
-780 



<AS(x)> S(x)=log(1+e 20x ) 
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Fig. 2. (a) {E(x)} calculated using the TPEM with the parameters shown vs. the 
cutoff, M. The exact value is also indicated, (b) (AS(x)) calculated using the TPEM 
with and without trace truncation vs. the cutoff, M. In the former case et r = 0.001 
and in the latter ej r = 0. The exact value is also indicated. 

Finally, the integral is calculated simply by multiplying the moments times 
the coefficients: ret = tpem_expansion (cutoff, moment, coef f s) ; 
The output of the program is presented at the end and in Fig. 2a. Similarly, 
other integrals are calculated in tpem_test.c with the function N(x) = 0.5(1.0— 
tanh(lO.Ox)). In both cases, it can be seen that after M ~ 30 the results agree 
with the ones obtained by applying the traditional diagonalization method. 
Moreover, if only M even is considered then the convergence for E(x) is 
achieved for a much smaller value of M, namely M ~ 10. 



4-2 Using Trace Truncation 



The last part of tpem_test.c tests the trace truncation. Consider two matri- 
ces corresponding to a one-dimensional spinless double exchange model with 
random potentials, Eq. (10), that differ only in the value of the potential at 
the first site. Consider the function, S(x) = log(1.0 + exp(— 10. Ox)), which 
emulates the action. The testing program calculates the difference in S(x) for 
both matrices in three ways: (i) with the exact diagonalization, (ii) by using 
the TPEM without trace truncation and (iii) by using trace truncation. The 
last two results are parameterized in terms of M and both assume e pr = 10 -5 
whereas = when no trace truncation is used and ej r = 10 -3 in the second 
case. 

The results are presented in Fig. 2b. Both TPEM calculations agree with the 
exact diagonalization after M « 20. 

The dependence of the quality of the results on e pr is shown in Fig. 3 for the 
function E(x) where it can be seen that e pr = 10~ 3 is enough to obtain very 
high accuracy for this model. However, for the systems that will be described 
in the next section we have found that e pr should be as small as 10 -5 for the 
results to be accurate, and this value will be used in the rest of this work. 
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Fig. 3. TPEM calculated {E) for different values of e pr . The x— axis is given in 
logarithmic scale. The value of (E) from the exact diagonalization technique is also 
indicated. 

5 Advanced Applications: TPEM and Monte Carlo 



5. 1 Double Exchange Model at Finite Coupling 



Double exchange models appear in the description of the colossal magnetore- 
sistance effect (CMR) in manganites where the electron-phonon coupling and 
the Coulomb interactions are usually neg lectedQ. These models can correctly 
produce ferromagnetic phases as long as the Hund coupling is large enough. In 
this case, the electrons directly jump from manganese to manganese spin and 
their kinetic energy is minimized if these spins are aligned. The Hamiltonian 
of the system in the one-band approximation can be written as [Tli ll2T|: 

H = -t d t d j° -J^Si-ffi, (11) 

<ij>,<r 

where c\ a creates a carrier at site i with spin a. The carrier-spin operator inter- 
acting ferromagnetically with the localized Mn-spin Si is cr, = Y^ap&ia&aflCip- 
On a cubic lattice of dimension D the largest and smallest spectrum bounds 
of Hamiltonian Eq. (11) are E m i n = —2tD — J and E max = —2tD + J, respec- 
tively. To test the TPEM for this physical model, we start with the interesting 
case of a ferromagnetic to paramagnetic transition. The coupling J is chosen 
to be J = 8t and the electronic density is adjusted with \i = —8t to obtain 
a quarter filling, i. e. n = 0.5. We have performed 1,000 thermalization and 
1,000 measurements which were enough to achieve both convergence and small 
errors. The results for the magnetization of the system, defined as 

|M| = ^|E^I (12) 

i 

are shown in Fig. 4 and compared to the traditional exact diagonalization 
calculation, where the high accuracy of the method can be seen clearly. We 
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Fig. 4. Magnetization vs. temperature, T, on a 4 3 lattice for model Eq. (11), calcu- 
lated using both exact diagonalization and the TPEM with the indicated parame- 
ters. The maximum possible value of \M\ is 1. 

repeat the calculations for larger lattices and also measure the magnetic sus- 
ceptibility, X; as a function of the temperature (see Fig. 5). 
The boundary conditions used are anti-periodic in one direction and periodic 
in the other two. This is a numerical trick in the sense that it is an effec- 
tive way to lift the degeneracy due to small size lattices. This degeneracy 
affects the form of the density of states making it difficult to expand it when 
performing simulations on 4 3 . The effect is less and less relevant as size in- 
creases. Moreover, the choice of boundary conditions does not matter in the 
thermodynamic limit. Twisted boundary conditions has been used extensively 
in numerical simulations 

The CPU times to perform these computations are shown in Fig. 6a making 
use of a conventional cluster of linux PCs with 3.06GHz of clock frequency 
each. Even using commodity PCs the CPU user time to perform calculations 
on the largest cluster studied, 10 3 , was less than 3 days. The results show 
that the CPU time scales linearly with the size of the system as predicted by 
the theory Moreover, the algorithm can be parallelized. This is because 
calculation of the moments in Eq. (6) is completely independent for each basis 
ket \v). In this way the basis can be partitioned in such a way that each 
processor calculates the moments corresponding to a portion of the basis. 
The CPU time to calculate the moments on each processor is proportional to 
Ndim/NpE, where Npe is the number of processors. It is important to remark 
that the data to be moved between different nodes are small compared to 
calculations in each node, the communication time is proportional to MNpp- 
Communication among nodes is mainly done here to add up all the moments. 
A version of the TPEM library that supports parallelization will be available 
in the future. 
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Fig. 5. TPEM calculated (a) Magnetization vs. T and (b) x vs - T for the lattices 
and parameters indicated. 
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Fig. 6. (a) CPU Times for the TPEM algorithm applied to model Eq. (11) using 
2,000 Monte Carlo steps, M = 30, e pr = 10~ 5 , e tr = 10" 3 on a 3.06GHz PC. 
The linear fit yields y = 0.0782a; — 3.3271. The CPU times required for the exact 
diagonalization procedure are also shown, (b) \M\ vs. T for size-extrapolated data 
(lattices used: 4 3 , 6 3 , 8 3 and 10 3 ) and estimation of the critical exponent (3. 

The value of Tc obtained from the x vs - T curves is approximately Tr, = 0.12t 
at J = 8t in very good agreement with previously calculated values |la,lla|. In 
addition, we calculated the scaling coefficient (3 defined by \M\ oc {Tc — T) 13 , 
after having made a size-extrapolation, i. e., after taking the thermodynamic 
limit. The result is shown in Fig. 6b and is within the error margin of the value 
given for the Universality class of the Heisenberg model. More information 
about the determination of critical exponents can be found in Ref. [l7| . 



5.2 Diluted Systems 



The Hamiltonian for the diluted spin-fermion model that will be consid- 
ered here is given by Eq. (11) except that the exchange term is replaced by 
JJ2i£i Si ■ <?j, i. e. localized spins are present on only a subset / of the lattice 
sites. Through nearest-neighbor hopping, the carriers can hop to any site of the 
square or cubic lattice. In the same way as in the case of the double-exchange 
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Fig. 7. \M\ vs. T/t on a 4 3 lattice calculated using the TPEM with the parameters 
indicated. The result given by the diagonalization method is also shown. 10,000 
thermalization Monte Carlo iterations and 10,000 measurements were performed. 
The same random configuration for the spatial location of the classical spins was 
considered in all cases. 

model, for diluted systems we have used periodic boundary conditions for 
faster convergence. These are specified by the phases (vr/4, ir/2, 37r/4) in the 
x, y and z directions respectively. 

It is important to remark that Monte Carlo calculations on diluted spin sys- 
tems with concentration x < 1 are 1/x times faster than the concentrated case 
since there are less sites with spins to propose a spin change. 
In this case the density of states will have a more complicated shape, usually 
including a small impurity band. Therefore, it is interesting to see whether the 
TPEM is capable of treating this case. The comparison is provided in Fig. 7 
for a concentration of 32 spins on a 4 3 lattice with approximately 16 elec- 
trons, where it can be seen that the TPEM algorithm converges for M = 40, 
e tr = 10~ 7 and e pr = 10~ 5 . This simple test shows that even in the case of 
systems with impurity bands and positional disorder, the expansion yields re- 
sults compatible with the exact treatment. Therefore, there is much potential 
for the use of this technique in the area of diluted magnetic semiconductors. 

5.3 Convergence 

The expansion parameters required for convergence, i. e. the cutoff M and 
the thresholds e tr and e pr , can be calculated on a small lattice where the exact 
diagonalization technique can be used to check the T/PEM algorithm. Since 
these numbers do not depend on the size of the system (only on the model, 
see jH) then they can be safely used on larger lattices. This is shown in Fig. 7 
where unlike for the concentrated system in this case neither M = 30 nor 
e tr = 10~ 5 is enough for convergence but M = 40, e tr = 10~ 7 and e pr = 10~ 5 
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is required. On the other hand, the double-exchange model, Eq.(ll), with 
infinite J (not studied here but discussed in [3]) converges with a cutoff 
smaller than M = 30. This is because the finite coupling system density-of- 
states is composed of two disconnected bands separated by approximately 2 J 
and so the spectrum extends over a wide range of energies whereas in the 
infinite coupling system there is a single connected band resulting in a faster 
convergence. 

Therefore, the reader and user of the TPEM library should not assume that 
the values presented in the previous examples will guarantee convergence for 
a particular model but should instead perform a check similar to the one 
presented in this section to determine the minimum value of the cutoff M and 
the maximum values of e tr and e pr required for convergence. 



6 Conclusions 

In summary, we have provided a software library that implements the TPEM 
for fermion systems coupled to classical fields. This library will allow theorists 
to study a variety of systems employing the TPEM at the same level that, for 
instance, the LAPACK library [19( is being used for exact diagonalization. 
The TPEM has an enormous potential. For example, studies of diluted mag- 
netic semiconductors that had not been possible before with more than one 
band will now be accessible and the results will be presented elsewhere. These 
studies are crucial to understand the properties of magnetic semiconductors 
and will help in the search for similar compounds with higher Curie temper- 
atures. These high-Curie temperature compounds would in turn be useful for 
technological applications, for example in the fabrication of spin electronic 
or spintronic devices |2(|. The possibility of studying larger systems will not 
merely imply a better estimation of the physical observables but will allow for 
the study of more complex systems like transition metal oxides with realistic 
bands and nanostructures. 
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8 Test Run Output 

****** TESTING TRUNCATED POLYNOMIAL EXPANSION ********** 

This testing program calculates model properties in two ways: 

(i) Using standard diagonalization 

(ii) Using the truncated polynomial expansion method 

All tests are done for a nearest neighbor interaction with 
random (diagonal) potentials. 



TEST 1: MEAN VALUE FOR THE FUNCTION: 

N(x) = 0.5 * (1.0 - tanh (10.0 * x)) 

** Using diagonalization <N>=195 . 349187 

** Using TPEM <N>=(cutof f — > infinity) lim<N_cutof f > where <N_cutoff> is 
cutoff <N_cutof f >%Error (compared to diag.) 



10 


194 


,740524890737 





,311577% 


11 


194, 


,921830592607 





,218765% 


12 


194, 


,265186764692 





, 554904% 


13 


194, 


, 136543222392 





, 620757% 


14 


194, 


,315979389497 





, 528903% 


15 


194, 


,408557066075 





,481512% 


16 


194, 


,679187298809 





, 342975% 


17 


194, 


,612068307437 





, 377334% 


18 


195, 


,036519662839 





, 160056% 


19 


195, 


,085374546781 





, 135047% 


20 


195, 


,361097093739 





, 006097% 


21 


195, 


,325459995073 





,012146% 


22 


195, 


,550660927328 





,103135% 


23 


195, 


,576686770534 





,116458% 


24 


195, 


,568643202933 





,112340% 


25 


195, 


,549624351656 





, 102605% 


26 


195, 


,521318469307 





,088115% 


27 


195, 


,535221730693 





, 095232% 


28 


195 


,475638686769 





,064731% 
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29 


195 


,465473123028 





, 059527% 


30 


195, 


,430165380516 





, 041453% 


31 


195, 


,437598886027 





, 045258% 


32 


195, 


,398432059641 





, 025209% 


33 


195, 


,392996060833 





, 022426% 


34 


195, 


,392858690195 





, 022356% 


35 


195, 


,396834086245 





,024391% 


OD 




, oouy / oooiooo 


u 




37 


195, 


,378071566221 





,014786% 


38 


195, 


,355235557817 





, 003096% 


39 


195, 


,357361742974 





, 004185% 


40 


195, 


,345473742639 





,001901% 



TEST 2: MEAN VALUE FOR THE FUNCTION: 

E(x) = 5.0 * x * (1.0 - tanh (10.0 * x)) 

** Using diagonalization <E>=-802 . 327051 

** Using TPEM <E>=(cutof f — > infinity) lim<E_cutof f > where <E_cutoff> is 
cutoff <E_cutof f >%Error (compared to diag.) 
(OUTPUT OMITTED, SEE FIG 2a) 



TEST 3: MEAN VALUE AND DIFFERENCE FOR THE FUNCTION: 
S(x) = log (1.0 + exp (-20.0 * x)) 

** Using diagonalization <S [matrixO] >= 854.249004021717 

** Using diagonalization <S [matrixl] >= 846.624672679166 

** Using diagonalization <S [matrixl] >-<S [matrixO] >=7 . 624331342551 

** Using TPEM <S>=(cutof f — > infinity) lim<S_cutof f > 

cutoff Delta_S_cutof f S_cutof f [dif f ] Error (to diag.) 

(OUTPUT OMITTED , SEE FIG 2b) 
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